Jarque–Bera normality test (jarque_bera)#
The Jarque–Bera (JB) test is a classic normality test built around two facts that must hold if data are normally distributed:
Skewness should be 0 (symmetry)
Kurtosis should be 3 (tail thickness; equivalently excess kurtosis should be 0)
It’s often used as a quick check (especially on model residuals) to see whether the Gaussian assumption is plausible.
Learning goals#
Understand what the JB statistic measures (skewness + kurtosis departures from normal)
Implement the test from scratch with NumPy only
Interpret the statistic and p-value correctly
Build intuition via Plotly visuals (histograms, Q–Q plots, and the null distribution)
Prerequisites#
Sample mean/variance and central moments
Basic hypothesis testing (null vs alternative, p-values)
The chi-square distribution (JB is asymptotically χ² with 2 degrees of freedom)
import numpy as np
import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(42)
VERSIONS = {"numpy": np.__version__, "plotly": plotly.__version__}
VERSIONS
{'numpy': '1.26.2', 'plotly': '6.5.2'}
1) What the test is for (and when to use it)#
What it tests#
The JB test is a hypothesis test of normality:
H₀ (null): the data are generated by a normal distribution
H₁ (alternative): the data are not normal
The test doesn’t try to match the entire shape of the distribution. It focuses on two features:
asymmetry (skewness)
tail heaviness / peakedness (kurtosis)
So the JB test is most informative when the key deviations you care about are skewness and/or tail behavior.
Where it shows up in practice#
Common use cases:
Residual diagnostics (linear regression, ARIMA, etc.)
Sanity checks before using methods that assume normality (or normal residuals)
Quick comparisons of “how non-normal” different samples look in terms of skewness and kurtosis
What it is not#
It is not a proof of normality. A large p-value means “insufficient evidence to reject normality”, not “the data are normal”.
It is not a comprehensive shape test: some non-normal distributions can have skewness near 0 and kurtosis near 3 (JB may miss them).
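To see that this blind spot is possible at the level of moments, here is a quick numerical check (the particular uniform/Laplace mixture is my choice, not something from the text above): a symmetric two-component mixture whose population skewness is 0 and whose population kurtosis is exactly 3, even though its density is clearly non-normal.

```python
import numpy as np

# Symmetric mixture: w * Uniform(-sqrt(3), sqrt(3)) + (1 - w) * Laplace(0, 1/sqrt(2)).
# Both components have mean 0 and variance 1; their kurtoses are 1.8 and 6,
# so w = 5/7 makes the mixture kurtosis exactly 3 while the density is clearly
# non-normal (flat core, sharp peak at 0, exponential tails).
w = 5.0 / 7.0
a = np.sqrt(3.0)        # uniform half-width
b = 1.0 / np.sqrt(2.0)  # Laplace scale

xs = np.linspace(-16.0, 16.0, 1_000_001)
dx = xs[1] - xs[0]
uniform_pdf = np.where(np.abs(xs) <= a, 1.0 / (2.0 * a), 0.0)
laplace_pdf = np.exp(-np.abs(xs) / b) / (2.0 * b)
f = w * uniform_pdf + (1.0 - w) * laplace_pdf

# Population moments by numerical integration (Riemann sum on a fine grid).
m2 = float(np.sum(xs**2 * f) * dx)
m3 = float(np.sum(xs**3 * f) * dx)
m4 = float(np.sum(xs**4 * f) * dx)
skew = m3 / m2**1.5
kurt = m4 / m2**2
print(f"skewness = {skew:.6f}, kurtosis = {kurt:.6f}")
```

The population moments match the normal values, so the JB statistic has nothing to key on in expectation; whether a finite sample from this mixture triggers rejection is a separate question (exercise 2 below explores it empirically).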
2) The Jarque–Bera statistic#
Let \(x_1,\dots,x_n\) be a sample.
Define the sample central moments

\[ m_k = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^k, \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i. \]

From these, define the (moment-based) sample skewness \(S\) and sample kurtosis \(K\):

\[ S = \frac{m_3}{m_2^{3/2}}, \qquad K = \frac{m_4}{m_2^{2}}. \]

For a normal distribution, \(S=0\) and \(K=3\).
The Jarque–Bera statistic combines deviations in both quantities:

\[ \mathrm{JB} = \frac{n}{6} \left( S^2 + \frac{(K-3)^2}{4} \right). \]
Asymptotic null distribution and p-value#
Under \(H_0\) (normality), as \(n \to \infty\):

\[ \mathrm{JB} \xrightarrow{d} \chi^2_2. \]

So the p-value is

\[ p = P\left(\chi^2_2 \ge \mathrm{JB}\right). \]

For 2 degrees of freedom, the survival function has a simple closed form:

\[ P\left(\chi^2_2 \ge x\right) = e^{-x/2}, \qquad x \ge 0. \]
That means for JB we can compute the p-value with just np.exp (no SciPy).
def _to_1d_finite(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=float).reshape(-1)
return x[np.isfinite(x)]
def central_moment(x: np.ndarray, k: int) -> float:
# Central moment m_k = mean((x - mean(x))^k)
if k < 0:
raise ValueError("k must be non-negative")
x = _to_1d_finite(x)
if x.size == 0:
return float("nan")
if k == 0:
return 1.0
mean = float(x.mean())
return float(np.mean((x - mean) ** k))
def sample_skewness_kurtosis(x: np.ndarray) -> tuple[float, float]:
# Moment-based sample skewness S and kurtosis K (normal => K=3).
x = _to_1d_finite(x)
if x.size < 3:
raise ValueError("Need at least 3 finite values")
m2 = central_moment(x, 2)
if not np.isfinite(m2) or m2 <= 0:
return float("nan"), float("nan")
m3 = central_moment(x, 3)
m4 = central_moment(x, 4)
s = m3 / (m2 ** 1.5)
k = m4 / (m2**2)
return float(s), float(k)
def jarque_bera_test(x: np.ndarray) -> dict:
# Jarque–Bera normality test (asymptotic χ² with df=2).
x = _to_1d_finite(x)
n = int(x.size)
if n < 3:
raise ValueError("Need at least 3 finite values")
s, k = sample_skewness_kurtosis(x)
if not (np.isfinite(s) and np.isfinite(k)):
return {
"n": n,
"statistic": float("nan"),
"p_value": float("nan"),
"skewness": float("nan"),
"kurtosis": float("nan"),
"excess_kurtosis": float("nan"),
}
jb = (n / 6.0) * (s**2 + ((k - 3.0) ** 2) / 4.0)
# Under H0: JB ~ chi-square(df=2) asymptotically.
# For df=2, survival function is exp(-x/2).
p_value = float(np.exp(-0.5 * jb))
return {
"n": n,
"statistic": float(jb),
"p_value": p_value,
"skewness": float(s),
"kurtosis": float(k),
"excess_kurtosis": float(k - 3.0),
"skew_component": float((n / 6.0) * (s**2)),
"kurtosis_component": float((n / 24.0) * ((k - 3.0) ** 2)),
}
# Quick check on a few synthetic samples
samples = {
"normal": rng.standard_normal(500),
"t(df=3)": rng.standard_t(df=3, size=500),
"exponential": rng.exponential(scale=1.0, size=500),
}
for name, x in samples.items():
res = jarque_bera_test(x)
print(
f"{name:>12s} | JB={res['statistic']:.3f} | p={res['p_value']:.4f} | "
f"skew={res['skewness']:.3f} | excess_kurt={res['excess_kurtosis']:.3f}"
)
normal | JB=0.783 | p=0.6762 | skew=0.093 | excess_kurt=-0.052
t(df=3) | JB=2202.120 | p=0.0000 | skew=0.243 | excess_kurt=10.270
exponential | JB=2584.447 | p=0.0000 | skew=2.379 | excess_kurt=10.071
3) How to interpret the result#
Pick a significance level \(\alpha\) (often 0.05).
If p-value ≤ α: reject \(H_0\) → the sample shows statistically significant evidence of non-normality (in skewness and/or kurtosis).
If p-value > α: fail to reject \(H_0\) → you do not have enough evidence (from skewness/kurtosis) to say it’s non-normal.
Two important interpretation details:
JB is an asymptotic test. For small samples, the χ² approximation can be rough.
With large n, tiny deviations become significant. In big datasets it’s common to reject normality even when the deviation is practically irrelevant.
So: always pair the p-value with effect size diagnostics (skewness, excess kurtosis) and plots.
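The large-n caveat can be made concrete with a self-contained sketch (the t(df=30) example is my choice): the population excess kurtosis is only 6/(30−4) ≈ 0.23, a practically mild deviation, yet with enough data JB rejects decisively.

```python
import numpy as np

rng = np.random.default_rng(7)

def jb_and_p(x):
    # Moment-based JB statistic, its chi-square(2) p-value, and excess kurtosis.
    n = x.size
    m = x - x.mean()
    m2, m3, m4 = (m**2).mean(), (m**3).mean(), (m**4).mean()
    s, k = m3 / m2**1.5, m4 / m2**2
    jb = (n / 6.0) * (s**2 + (k - 3.0) ** 2 / 4.0)
    return float(jb), float(np.exp(-0.5 * jb)), float(k - 3.0)

# t(df=30) is nearly normal: population excess kurtosis = 6 / (30 - 4) ~ 0.23.
jb_s, p_s, e_s = jb_and_p(rng.standard_t(df=30, size=200))
jb_l, p_l, e_l = jb_and_p(rng.standard_t(df=30, size=200_000))

print(f"n=200    : JB={jb_s:9.2f}  p={p_s:.4f}  excess_kurt={e_s:+.3f}")
print(f"n=200000 : JB={jb_l:9.2f}  p={p_l:.4f}  excess_kurt={e_l:+.3f}")
# The deviation from normality is the same (and tiny) in both cases, but the
# large sample rejects overwhelmingly while the small one typically cannot.
```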
def chi2_df2_pdf(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=float)
out = np.zeros_like(x)
mask = x >= 0
out[mask] = 0.5 * np.exp(-0.5 * x[mask])
return out
def chi2_df2_sf(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=float)
out = np.ones_like(x)
mask = x >= 0
out[mask] = np.exp(-0.5 * x[mask])
return out
def plot_jb_pvalue(jb: float, *, title: str = "JB vs χ²(2) (right-tail p-value)") -> go.Figure:
xmax = float(max(20.0, jb * 1.5))
xs = np.linspace(0.0, xmax, 600)
fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=chi2_df2_pdf(xs), mode="lines", name="χ²(2) PDF"))
xs_tail = xs[xs >= jb]
fig.add_trace(
go.Scatter(
x=np.r_[xs_tail, xs_tail[::-1]],
y=np.r_[chi2_df2_pdf(xs_tail), np.zeros_like(xs_tail)],
fill="toself",
fillcolor="rgba(31, 119, 180, 0.2)",
line=dict(color="rgba(0,0,0,0)"),
hoverinfo="skip",
name="p-value area",
showlegend=True,
)
)
p = float(chi2_df2_sf(jb))
fig.add_vline(x=jb, line_width=2, line_dash="dash", line_color="firebrick")
fig.update_layout(
title=f"{title}<br><sup>JB={jb:.3f}, p-value={p:.4f}</sup>",
xaxis_title="x",
yaxis_title="density",
legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0.0),
)
return fig
res_norm = jarque_bera_test(samples["normal"])
plot_jb_pvalue(res_norm["statistic"], title="Normal sample: JB null distribution")
4) The null distribution: simulated JB values under true normality#
def simulate_jb_under_normal(*, n: int, m: int, rng: np.random.Generator) -> np.ndarray:
jb_vals = np.empty(m, dtype=float)
for i in range(m):
x = rng.standard_normal(n)
jb_vals[i] = jarque_bera_test(x)["statistic"]
return jb_vals
n = 200
m = 5000
jb_vals = simulate_jb_under_normal(n=n, m=m, rng=rng)
xs = np.linspace(0.0, np.quantile(jb_vals, 0.995), 600)
alpha = 0.05
crit = float(-2.0 * np.log(alpha)) # df=2
rejection_rate = float(np.mean(jb_vals >= crit))
fig = go.Figure()
fig.add_trace(go.Histogram(x=jb_vals, nbinsx=60, histnorm="probability density", name="simulated JB"))
fig.add_trace(go.Scatter(x=xs, y=chi2_df2_pdf(xs), mode="lines", name="χ²(2) PDF"))
fig.add_vline(x=crit, line_dash="dash", line_color="firebrick")
fig.update_layout(
title=(
"JB under H₀ (true normality)" +
f"<br><sup>n={n}, simulations={m}, α={alpha} ⇒ critical={crit:.3f}, empirical rejection≈{rejection_rate:.3f}</sup>"
),
xaxis_title="JB statistic",
yaxis_title="density",
)
fig
5) What JB is “measuring” (geometry in skewness/kurtosis space)#
Because

\[ \mathrm{JB} = \frac{n}{6} \left( S^2 + \frac{(K-3)^2}{4} \right), \]

JB is basically a scaled squared distance from the normal point \((S, K-3) = (0, 0)\).
Moving away from 0 skewness increases JB.
Moving away from 0 excess kurtosis increases JB.
The same p-value corresponds (approximately) to an ellipse in \((S, K-3)\) space.
The plot below shows this idea.
# A contour view of JB as a function of (skewness, excess kurtosis)
n_for_geometry = 200
s_grid = np.linspace(-1.2, 1.2, 241)
e_grid = np.linspace(-2.5, 2.5, 241) # excess kurtosis
S, E = np.meshgrid(s_grid, e_grid)
JB_grid = (n_for_geometry / 6.0) * (S**2 + (E**2) / 4.0)
alpha_levels = [0.10, 0.05, 0.01]
crit_levels = {a: float(-2.0 * np.log(a)) for a in alpha_levels} # df=2
fig = go.Figure()
fig.add_trace(
go.Contour(
x=s_grid,
y=e_grid,
z=JB_grid,
contours=dict(coloring="lines", showlabels=True),
line=dict(width=1),
showscale=False,
hovertemplate="skew=%{x:.3f}<br>excess kurt=%{y:.3f}<br>JB≈%{z:.3f}<extra></extra>",
)
)
for a, q in crit_levels.items():
rhs = (6.0 * q) / n_for_geometry
s_line = np.linspace(-np.sqrt(rhs), np.sqrt(rhs), 400)
e_line = 2.0 * np.sqrt(np.maximum(0.0, rhs - s_line**2))
fig.add_trace(go.Scatter(x=s_line, y=e_line, mode="lines", name=f"α={a} boundary"))
fig.add_trace(go.Scatter(x=s_line, y=-e_line, mode="lines", showlegend=False))
pts = []
for name, x in samples.items():
r = jarque_bera_test(x)
pts.append((name, r["skewness"], r["excess_kurtosis"]))
fig.add_trace(
go.Scatter(
x=[p[1] for p in pts],
y=[p[2] for p in pts],
mode="markers+text",
text=[p[0] for p in pts],
textposition="top center",
marker=dict(size=10, color="firebrick"),
name="example samples",
hovertemplate="%{text}<br>skew=%{x:.3f}<br>excess kurt=%{y:.3f}<extra></extra>",
)
)
fig.update_layout(
title=(
"JB statistic as a function of skewness and excess kurtosis"
+ f"<br><sup>n={n_for_geometry}; JB contours + approximate rejection boundaries</sup>"
),
xaxis_title="skewness S",
yaxis_title="excess kurtosis (K − 3)",
)
fig
6) Visual intuition: histogram + Q–Q plot#
A good workflow is:
Run JB (get p-value, skewness, excess kurtosis)
Inspect a standardized histogram with a normal overlay
Inspect a Q–Q plot (deviations at both ends, an S-shaped pattern, indicate non-normal tails; one-sided curvature indicates skew)
Below we compare three distributions after z-scoring (mean 0, std 1).
def norm_pdf(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=float)
return (1.0 / np.sqrt(2.0 * np.pi)) * np.exp(-0.5 * x**2)
def norm_ppf(p: np.ndarray) -> np.ndarray:
# Approximate inverse CDF of the standard normal distribution.
# Vectorized rational approximation (Acklam). Returns NaN for p outside (0, 1).
p = np.asarray(p, dtype=float)
x = np.full_like(p, np.nan, dtype=float)
a = np.array(
[
-39.69683028665376,
220.9460984245205,
-275.9285104469687,
138.3577518672690,
-30.66479806614716,
2.506628277459239,
]
)
b = np.array(
[
-54.47609879822406,
161.5858368580409,
-155.6989798598866,
66.80131188771972,
-13.28068155288572,
]
)
c = np.array(
[
-0.007784894002430293,
-0.3223964580411365,
-2.400758277161838,
-2.549732539343734,
4.374664141464968,
2.938163982698783,
]
)
d = np.array(
[
0.007784695709041462,
0.3224671290700398,
2.445134137142996,
3.754408661907416,
]
)
plow = 0.02425
phigh = 1.0 - plow
valid = (p > 0.0) & (p < 1.0)
mask = valid & (p < plow)
if np.any(mask):
q = np.sqrt(-2.0 * np.log(p[mask]))
num = (((((c[0] * q + c[1]) * q + c[2]) * q + c[3]) * q + c[4]) * q + c[5])
den = ((((d[0] * q + d[1]) * q + d[2]) * q + d[3]) * q + 1.0)
x[mask] = -(num / den)
mask = valid & (p >= plow) & (p <= phigh)
if np.any(mask):
q = p[mask] - 0.5
r = q * q
num = (((((a[0] * r + a[1]) * r + a[2]) * r + a[3]) * r + a[4]) * r + a[5]) * q
den = (((((b[0] * r + b[1]) * r + b[2]) * r + b[3]) * r + b[4]) * r + 1.0)
x[mask] = num / den
mask = valid & (p > phigh)
if np.any(mask):
q = np.sqrt(-2.0 * np.log(1.0 - p[mask]))
num = (((((c[0] * q + c[1]) * q + c[2]) * q + c[3]) * q + c[4]) * q + c[5])
den = ((((d[0] * q + d[1]) * q + d[2]) * q + d[3]) * q + 1.0)
x[mask] = num / den
return x
def zscore(x: np.ndarray) -> np.ndarray:
x = _to_1d_finite(x)
mu = x.mean()
sigma = x.std(ddof=0)
return (x - mu) / sigma
labels = list(samples.keys())
z_samples = {k: zscore(v) for k, v in samples.items()}
results = {k: jarque_bera_test(v) for k, v in z_samples.items()}
subplot_titles = [
f"{k}<br><sup>JB={results[k]['statistic']:.2f}, p={results[k]['p_value']:.4f}</sup>" for k in labels
]
fig = make_subplots(
rows=2,
cols=3,
subplot_titles=subplot_titles * 2,
vertical_spacing=0.12,
row_heights=[0.55, 0.45],
)
x_grid = np.linspace(-4.0, 4.0, 500)
for j, k in enumerate(labels, start=1):
z = z_samples[k]
fig.add_trace(
go.Histogram(x=z, nbinsx=45, histnorm="probability density", name=f"{k} hist", showlegend=False),
row=1,
col=j,
)
fig.add_trace(
go.Scatter(
x=x_grid,
y=norm_pdf(x_grid),
mode="lines",
line=dict(color="black"),
name="N(0,1)",
showlegend=(j == 1),
),
row=1,
col=j,
)
z_sorted = np.sort(z)
n = z_sorted.size
p = (np.arange(1, n + 1) - 0.5) / n
q = norm_ppf(p)
fig.add_trace(
go.Scatter(x=q, y=z_sorted, mode="markers", marker=dict(size=4), name=f"{k} QQ", showlegend=False),
row=2,
col=j,
)
fig.add_trace(
go.Scatter(
x=[q.min(), q.max()],
y=[q.min(), q.max()],
mode="lines",
line=dict(color="firebrick", dash="dash"),
name="y=x",
showlegend=(j == 1),
),
row=2,
col=j,
)
for j in range(1, 4):
fig.update_xaxes(title_text="z", row=1, col=j)
fig.update_yaxes(title_text="density", row=1, col=j)
fig.update_xaxes(title_text="theoretical quantiles (N(0,1))", row=2, col=j)
fig.update_yaxes(title_text="sample quantiles (z-scored)", row=2, col=j)
fig.update_layout(title="Distributions (z-scored): histogram + Q–Q plot", height=850)
fig
7) Practical notes & pitfalls#
Asymptotic nature: JB relies on a large-sample χ² approximation. For small \(n\), use JB as a hint and rely more on plots or small-sample tests.
Outliers: skewness and kurtosis are highly sensitive to outliers → a single extreme value can drive rejection.
Power and misspecification: JB is tuned to skewness/kurtosis. It can have low power against other deviations (e.g., some symmetric mixtures).
Big n: with enough data, any small deviation becomes statistically significant. Always check whether non-normality is practically important.
A good “diagnostics bundle” is: JB p-value + (skewness, excess kurtosis) + histogram + Q–Q plot.
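The numeric half of that bundle can be collected in one small helper. This is a sketch, not a standard API: the function name `normality_report` and the report keys are mine, and the moment computations just restate the definitions used throughout this notebook.

```python
import numpy as np

def normality_report(x, alpha: float = 0.05) -> dict:
    # JB p-value together with the effect-size diagnostics it is built from.
    x = np.asarray(x, dtype=float).reshape(-1)
    x = x[np.isfinite(x)]
    n = x.size
    m = x - x.mean()
    m2, m3, m4 = (m**2).mean(), (m**3).mean(), (m**4).mean()
    s, k = m3 / m2**1.5, m4 / m2**2
    jb = (n / 6.0) * (s**2 + (k - 3.0) ** 2 / 4.0)
    p = float(np.exp(-0.5 * jb))  # chi-square(2) survival function
    return {
        "n": n,
        "statistic": float(jb),
        "p_value": p,
        "skewness": float(s),
        "excess_kurtosis": float(k - 3.0),
        "reject_normality": p <= alpha,
    }

rng = np.random.default_rng(1)
print(normality_report(rng.standard_normal(1000)))
print(normality_report(rng.exponential(size=1000)))
```

Reading the full dict (rather than just the boolean) keeps the effect sizes in view, which is exactly the point of the bundle; the histogram and Q–Q plot from section 6 supply the visual half.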
Exercises#
Simulate JB rejection rates under \(H_0\) for different sample sizes (\(n=20,50,200,1000\)). How quickly does the χ² approximation settle?
Build two non-normal samples with skewness near 0 and kurtosis near 3 (e.g. symmetric mixtures) and see whether JB detects them.
Apply JB to regression residuals from a simple linear model with (a) Gaussian noise and (b) heavy-tailed noise. Compare with Q–Q plots.
References#
Jarque, C. M., & Bera, A. K. (1980). Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Economics Letters, 6(3), 255–259.
Jarque, C. M., & Bera, A. K. (1987). A test for normality of observations and regression residuals. International Statistical Review, 55(2), 163–172.